🌊 CUDA Streams - miterion · Scour

Heterogeneous Processing: A Strategy for Augmenting Moore's Law (2006)

linuxjournal.com·1d

⚡CUDA Programming Patterns

How Anam Achieved 250% Faster Inference Using Zymtrace Continuous GPU Profiling

zymtrace.com·12h

End-to-End Throughput Benchmarking of Portable Deterministic CNN-Based Signal Processing Pipelines

arxiv.org·9h

CUDA Guide: Workflow for Performance Tuning

digitalocean.com·4d

⚡CUDA Programming Patterns

Leveraging io_uring for performant asynchronous Linux applications

dev.to·14h

⏱️CUDA Events

Concurrent vs. Parallel Execution in LLM API Calls: From an AI Engineer’s Perspective

pub.towardsai.net·8h

🤖AI Coding Tools

Faster AI Training Unlocked With New System For Massive Language Models

quantumzeitgeist.com·20m

🎯Tensor Cores

Hitting 1,000 tokens per second on a single RTX 5090

blog.alpindale.net·15h

🎛️CUDA Optimization

building cuda-gdb from sources

redplait.blogspot.com·1d

⚡CUDA Programming Patterns

Quantized Tensor Train Compression For Turbulent Flow Simulation: O(log N) Scaling with Reynolds-Independent Bond Dimension

zenodo.org·1h

🏎️TensorRT

Graphics Programming Conference

graphicsprogrammingconference.com·22h

How PCIe, NVLink, and NUMA Topology Affect GPU Scheduling Outcomes

dev.to·9h

📊CUDA Graphs

The Avatar Cache: Enabling On-Demand Security with Morphable Cache Architecture

arxiv.org·9h

⚡CUDA Programming Patterns

Threads, processes and concurrency in Python: some thoughts

artima.com·1d

⚡CUDA Programming Patterns

From Prediction to Compilation: A Manifesto for Intrinsically Reliable AI

news.ycombinator.com·1d

🤖AI Coding Tools

feldera/feldera: The Feldera Incremental Computation Engine

github.com·2d

🏗️Build Optimization

**Abstract** In high‑performance computing (HPC) systems, thermal management of dense GPU caches is a critical bottleneck. Conventional continuum‑based therm...

freederia.com·3d

⚡CUDA Programming Patterns

An introduction to lockless algorithms [LWN.net]

lwn.net·2h

🔲Loop Tiling

Performance Tip of the Week #7: Optimizing for application productivity

abseil.io·1d

⚙️Systems Programming

Main Content || Math ∩ Programming

jeremykun.com·15h

📉Model Quantization
